Abstract. A theorem of J. Kruskal from 1977, motivated by a latent-class
statistical model, established that under certain explicit conditions the
expression of a 3-dimensional tensor as the sum of rank-1 tensors is essentially
unique. We give a new proof of this fundamental result, which is substantially
shorter than both the original one and recent versions along the original lines.

1. Introduction 

In [10], J. Kruskal proved that, under certain explicit conditions, the expression
of a 3-dimensional tensor (i.e., a 3-way array) of rank r as a sum of r tensors of
rank 1 is unique, up to permutation of the summands. (See also [HIE!].) This result
contrasts sharply with the well-known non-uniqueness of expressions of matrices of
rank at least 2 as sums of rank-1 matrices. The uniqueness of this tensor
decomposition is moreover of fundamental interest for a number of applications,
ranging from Kruskal's original motivation by latent-class models used in
psychometrics, to chemistry and signal processing, as mentioned in [11] and its
references. In these fields, the expression of a tensor as a sum of rank-1 tensors
is often referred to as the Candecomp or Parafac decomposition. Recently, Kruskal's
theorem has been used as a general tool for investigating the identifiability of a
wide variety of statistical models with hidden variables [1, 2].

As noted in [11], Kruskal's original proof was "rather inaccessible," leading a
number of authors to work toward a shorter and more intuitive presentation. This
thread, which continued to follow the basic outline of Kruskal's approach in which
his 'Permutation Lemma' plays a key role, culminated in the proof given in [11]. In
this paper, we present a new and more concise proof of Kruskal's theorem, Theorem 3
below, that follows an entirely different approach. While the resulting theorem is
identical, the alternative argument given here offers a new perspective on the role
of Kruskal's explicit condition ensuring uniqueness.

While Kruskal's theorem gives a sufficient condition for uniqueness of a
decomposition, the condition is known in general not to be necessary. Of particular note
are recent independent works of De Lathauwer [4] and Jiang and Sidiropoulos [6] , 
which give a different, though in some ways more narrow, criterion that can ensure 
uniqueness. See also [H] for the connection between these works. 

It would, of course, be highly desirable to obtain conditions (more involved than
Kruskal's) that would ensure the essential uniqueness of the expression of a rank
r tensor as a sum of rank-1 tensors under a wider range of assumptions on the
size and rank of the tensor. Note that both Kruskal's condition and that of [4, 6]
can be phrased algebraically, in terms of the non-vanishing of certain polynomials
in the variables of a natural parameterization of rank r tensors. This algebraic
formulation allows one to conclude that generic rank r tensors of certain sizes
have unique decompositions. Having explicit understanding of these polynomial
conditions is essential for certain applications, such as in [1]. The general problem
of determining for which sizes and ranks of generic tensors the decomposition is
essentially unique, and what explicit algebraic conditions can ensure uniqueness,
remains open.

2. Notation 

Throughout, we work over an arbitrary field. 

For a matrix such as $M_k$, we use $\mathbf{m}^k_j$ to denote the $j$th column,
$\bar{\mathbf{m}}^k_i$ to denote the $i$th row, and $m^k_{ij}$ the $(i,j)$th entry.
We use $\langle S \rangle$ to denote the span of a set of vectors $S$. With
$[r] = \{1,2,3,\dots,r\}$, we denote by $\mathfrak{S}_r$ the symmetric group on $[r]$.

Given matrices $M_l$ of size $s_l \times r$, the matrix triple product
$[M_1, M_2, M_3]$ is an $s_1 \times s_2 \times s_3$ tensor defined as a sum of $r$
rank-1 tensors by

\[ [M_1, M_2, M_3] = \sum_{i=1}^{r} \mathbf{m}^1_i \otimes \mathbf{m}^2_i \otimes \mathbf{m}^3_i, \]

so

\[ [M_1, M_2, M_3](j,k,l) = \sum_{i=1}^{r} m^1_{ji}\, m^2_{ki}\, m^3_{li}. \]

A matrix $A$ of size $t \times s_l$ acts on an $s_1 \times s_2 \times s_3$ tensor $T$
'in the $l$th coordinate.' For example, with $l = 1$,

\[ (A *_1 T)(i,j,k) = \sum_{n=1}^{s_1} a_{in}\, T(n,j,k), \]

so that $A *_1 T$ is of size $t \times s_2 \times s_3$. One then easily checks that

\[ A *_1 [M_1, M_2, M_3] = [A M_1, M_2, M_3], \]

with similar formulas applying for actions in other coordinates.
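
The identity for the action in the first coordinate can be checked numerically on a
small example. The following is a minimal sketch and not part of the original
argument; the function names `triple_product`, `act_mode1`, and `matmul`, and the
particular integer matrices, are hypothetical choices made only for illustration.

```python
def triple_product(M1, M2, M3):
    """Return T with T[j][k][l] = sum_i M1[j][i] * M2[k][i] * M3[l][i]."""
    r = len(M1[0])  # common number of columns
    return [[[sum(M1[j][i] * M2[k][i] * M3[l][i] for i in range(r))
              for l in range(len(M3))]
             for k in range(len(M2))]
            for j in range(len(M1))]


def act_mode1(A, T):
    """Return A *_1 T, where (A *_1 T)[i][j][k] = sum_n A[i][n] * T[n][j][k]."""
    s1, s2, s3 = len(T), len(T[0]), len(T[0][0])
    return [[[sum(A[i][n] * T[n][j][k] for n in range(s1))
              for k in range(s3)]
             for j in range(s2)]
            for i in range(len(A))]


def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][n] * B[n][j] for n in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Illustrative data (an assumption, not from the paper): r = 2 columns.
M1 = [[1, 2], [0, 1], [3, 1]]   # 3 x 2
M2 = [[1, 0], [1, 1]]           # 2 x 2
M3 = [[2, 1], [1, 1]]           # 2 x 2
A = [[1, 1, 0], [0, 2, 1]]      # 2 x 3, acts in the first coordinate

lhs = act_mode1(A, triple_product(M1, M2, M3))
rhs = triple_product(matmul(A, M1), M2, M3)
identity_holds = (lhs == rhs)
```

Exact integer arithmetic is used here so that the comparison `lhs == rhs` is
meaningful without any numerical tolerance.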

Definition. The Kruskal rank, or K-rank, of a matrix is the largest number j such 
that every set of j columns is independent. 
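
To make the definition concrete, the K-rank of a small matrix can be computed by
brute force over column subsets. The sketch below is only an illustration under
stated assumptions: it works over the rationals (the paper allows an arbitrary
field), and the helper names `rank` and `kruskal_rank` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    mat = [[Fraction(x) for x in row] for row in rows]
    ncols = len(mat[0]) if mat else 0
    rk = 0
    for col in range(ncols):
        # Find a pivot in this column among the rows not yet used.
        pivot = next((i for i in range(rk, len(mat)) if mat[i][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for i in range(rk + 1, len(mat)):
            factor = mat[i][col] / mat[rk][col]
            mat[i] = [a - factor * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def kruskal_rank(M):
    """Largest j such that every set of j columns of M is independent."""
    ncols = len(M[0])
    cols = [[row[c] for row in M] for c in range(ncols)]
    k = 0
    for j in range(1, ncols + 1):
        # A set of j vectors is independent exactly when it has rank j.
        if all(rank([cols[c] for c in subset]) == j
               for subset in combinations(range(ncols), j)):
            k = j
        else:
            break  # failing at size j implies failing at every larger size
    return k


# Rank 2 and K-rank 2: every pair of columns is independent.
k_generic = kruskal_rank([[1, 0, 1], [0, 1, 1]])
# Rank 2 but K-rank 1: the first and third columns are proportional.
k_repeated = kruskal_rank([[1, 0, 2], [0, 1, 0]])
# A zero column forces K-rank 0.
k_zero = kruskal_rank([[1, 0], [0, 0]])
```

The second example shows how the K-rank can be strictly smaller than the ordinary
rank, which is the gap measured by the $a_i$ in the definition of type that follows.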

Definition. We say a triple of matrices $(M_1, M_2, M_3)$ is of type
$(r; a_1, a_2, a_3)$ if each $M_i$ has $r$ columns and the K-rank of $M_i$ is at
least $r - a_i$.

In a slight abuse of notation, we will say a product $[M_1, M_2, M_3]$ is of type
$(r; a_1, a_2, a_3)$ when the triple $(M_1, M_2, M_3)$ is of that type.

Note that with this definition, type $(r; a_1, a_2, a_3)$ implies type
$(r; b_1, b_2, b_3)$ as long as $a_i \le b_i$ for each $i$. Thus $a_i$ is a bound on
the gap between the K-rank of the matrix $M_i$ and the number $r$ of its columns.
Intuitively, when the $a_i$ are small it should be easier to identify the $M_i$ from
the product $[M_1, M_2, M_3]$.

We will not need to be explicit about the number of rows in any of the $M_i$,
though type $(r; a_1, a_2, a_3)$ of course implies $M_i$ has at least $r - a_i$ rows.

3. The proof

We begin by establishing a lemma that generalizes a basic insight that has been
rediscovered many times over the years, in which matrix diagonalizations arising
from matrix slices of a 3-dimensional tensor are used to understand the tensor
decomposition. A few such instances of the appearance of this idea include [3],
[7], and other such references are mentioned in [5], where the idea is exploited for
computational purposes.

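Lemma 1. Suppose $(M_1, M_2, M_3)$ is of type $(r; 0, 0, r-1)$; $N_1, N_2, N_3$ are
matrices with $r$ columns; and $[M_1, M_2, M_3] = [N_1, N_2, N_3]$. Then there is
some permutation $\sigma \in \mathfrak{S}_r$ such that the following holds:

Let $\mathcal{I} \subseteq [r]$ be any maximal subset (with respect to inclusion) of
indices with the property that $\langle \{\mathbf{m}^3_i\}_{i \in \mathcal{I}} \rangle$
is 1-dimensional. Then

(1) $\langle \{\mathbf{m}^j_i\}_{i \in \mathcal{I}} \rangle = \langle \{\mathbf{n}^j_{\sigma(i)}\}_{i \in \mathcal{I}} \rangle$, for $j = 1,2,3$, and

(2) $\mathcal{I}$ is also maximal for the property that $\langle \{\mathbf{n}^3_{\sigma(i)}\}_{i \in \mathcal{I}} \rangle$ is 1-dimensional.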

Proof. That $(M_1, M_2, M_3)$ is of type $(r; 0, 0, r-1)$ means $M_1, M_2$ have full
column rank, and $M_3$ has no zero columns.

Choose some vector $\mathbf{c}$ that is not orthogonal to any of the columns of
$M_3$, so that $\mathbf{c}^T M_3$ has no zero entries. Then

\[ A = \mathbf{c}^T *_3 [M_1, M_2, M_3] = [M_1, M_2, \mathbf{c}^T M_3] = M_1 \operatorname{diag}(\mathbf{c}^T M_3)\, M_2^T \]

is a matrix of rank $r$. Since

\[ A = \mathbf{c}^T *_3 [N_1, N_2, N_3] = [N_1, N_2, \mathbf{c}^T N_3] = N_1 \operatorname{diag}(\mathbf{c}^T N_3)\, N_2^T, \]

$N_1$ and $N_2$ must also have rank $r$, and $\mathbf{c}^T N_3$ has no zero entries.
These two expressions for $A$ also show that the span of the columns of $M_j$ is the
same as that of the columns of $N_j$ for $j = 1,2$. Expressing the columns of $M_j$
and $N_j$ in terms of a basis given by the columns of $M_j$, we may henceforth assume
$M_1 = M_2 = I_r$, the $r \times r$ identity, and $N_1, N_2$ are invertible. Thus
$A = \operatorname{diag}(\mathbf{c}^T M_3)$.
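
The contraction identity used for $A$ above can be checked on a small example. This
is a minimal sketch, not part of the proof; the helper names `triple_product`,
`contract_mode3`, and `diag_sandwich`, as well as the matrices and the vector `c`,
are hypothetical and chosen only so that $\mathbf{c}^T M_3$ has no zero entries.

```python
def triple_product(M1, M2, M3):
    """Return T with T[j][k][l] = sum_i M1[j][i] * M2[k][i] * M3[l][i]."""
    r = len(M1[0])
    return [[[sum(M1[j][i] * M2[k][i] * M3[l][i] for i in range(r))
              for l in range(len(M3))]
             for k in range(len(M2))]
            for j in range(len(M1))]


def contract_mode3(c, T):
    """Return the matrix c^T *_3 T, with entries sum_l c[l] * T[j][k][l]."""
    return [[sum(c[l] * T[j][k][l] for l in range(len(c)))
             for k in range(len(T[0]))]
            for j in range(len(T))]


def diag_sandwich(M1, d, M2):
    """Return M1 diag(d) M2^T, with entries sum_i M1[j][i] * d[i] * M2[k][i]."""
    return [[sum(M1[j][i] * d[i] * M2[k][i] for i in range(len(d)))
             for k in range(len(M2))]
            for j in range(len(M1))]


# Illustrative data (an assumption): M1, M2 have full column rank and
# M3 has no zero columns, so the triple is of type (2; 0, 0, 1).
M1 = [[1, 0], [1, 1]]
M2 = [[1, 2], [0, 1]]
M3 = [[1, 0], [1, 1], [0, 2]]
c = [1, 1, 1]

# c^T M3: one weight per column of M3; here every weight is nonzero.
weights = [sum(c[l] * M3[l][i] for l in range(len(c))) for i in range(2)]

A_from_tensor = contract_mode3(c, triple_product(M1, M2, M3))
A_from_factors = diag_sandwich(M1, weights, M2)
```

In this example both computations give the matrix with rows (2, 0) and (8, 3), whose
determinant 6 is nonzero, so $A$ has rank $r = 2$ as the argument requires.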

Now let $S_i$ denote the slice of $[M_1, M_2, M_3] = [N_1, N_2, N_3]$ with fixed third
coordinate $i$, so $S_i$ is an $r \times r$ matrix. Recalling that
$\bar{\mathbf{m}}^j_i$ and $\bar{\mathbf{n}}^j_i$ denote the $i$th rows of $M_j$ and
$N_j$, we have

\[ S_i = \operatorname{diag}(\bar{\mathbf{m}}^3_i) = N_1 \operatorname{diag}(\bar{\mathbf{n}}^3_i)\, N_2^T. \]

Note the matrices

\[ S_i A^{-1} = \operatorname{diag}(\bar{\mathbf{m}}^3_i) \operatorname{diag}(\mathbf{c}^T M_3)^{-1} = N_1 \operatorname{diag}(\bar{\mathbf{n}}^3_i) \operatorname{diag}(\mathbf{c}^T N_3)^{-1} N_1^{-1}, \]

for various choices of $i$, commute. Thus their (right) simultaneous eigenspaces are
determined. But from the two expressions for $S_i A^{-1}$ we see its
$\alpha$-eigenspace is spanned by the set

\[ \{ \mathbf{e}_j = \mathbf{m}^1_j \mid m^3_{ij} / (\mathbf{c}^T \mathbf{m}^3_j) = \alpha \}, \]

and also by the set

\[ \{ \mathbf{n}^1_j \mid n^3_{ij} / (\mathbf{c}^T \mathbf{n}^3_j) = \alpha \}. \]

A simultaneous eigenspace for the $S_i A^{-1}$ is thus spanned by the set
$\{\mathbf{e}_j\}_{j \in \mathcal{I}}$ where $\mathcal{I}$ is a maximal set of indices
with the property that if $j, k \in \mathcal{I}$, then

\[ m^3_{ij} / (\mathbf{c}^T \mathbf{m}^3_j) = m^3_{ik} / (\mathbf{c}^T \mathbf{m}^3_k), \quad \text{for all } i. \]

This condition is equivalent to $\mathbf{m}^3_j$ and $\mathbf{m}^3_k$ being scalar
multiples of one another. Such a set $\mathcal{I}$ is therefore exactly of the sort
described in the statement of the lemma. As the simultaneous eigenspaces are also
spanned by similar sets defined in terms of the columns of $N_1$, one may choose a
permutation $\sigma$ so that claim (2) holds, as well as claim (1) for $j = 1$.

The case $j = 2$ of claim (1) is similarly proved using the transposes of $A$ and
the $S_i$. As the needed permutation of the columns of the $N_j$ in the two cases of
$j = 1,2$ is dependent only on the maximal sets $\mathcal{I}$, a common $\sigma$ may
be chosen. Finally, the case $j = 3$ follows from equating eigenvalues in the two
expressions giving diagonalizations for $S_i A^{-1}$, to see that for all $i$

\[ m^3_{ij} / (\mathbf{c}^T \mathbf{m}^3_j) = n^3_{i\sigma(j)} / (\mathbf{c}^T \mathbf{n}^3_{\sigma(j)}), \]

so $\mathbf{m}^3_j$ and $\mathbf{n}^3_{\sigma(j)}$ are scalar multiples of one another. □

This lemma quickly yields a special case of Kruskal's theorem, when two of the
matrices in the product are assumed to have full column rank.

Corollary 2. Suppose $(M_1, M_2, M_3)$ is of type $(r; 0, 0, r-2)$; $N_1, N_2, N_3$
are matrices with $r$ columns; and $[M_1, M_2, M_3] = [N_1, N_2, N_3]$. Then there
exists some permutation matrix $P$ and invertible diagonal matrices $D_i$ with
$D_1 D_2 D_3 = I_r$ such that $N_i = M_i D_i P$.

Proof. Since $(M_1, M_2, M_3)$ is also of type $(r; 0, 0, r-1)$, we may apply Lemma 1.
As in the proof of that lemma, we may also assume $M_1 = M_2 = I_r$. But $M_3$ has
K-rank at least 2, so every pair of columns is independent. Thus the maximal sets
of indices in Lemma 1 are all singletons. Thus with $P$ acting to permute columns
by $\sigma$, the one-dimensionality of all eigenspaces shows there are invertible
diagonal matrices $D_1, D_2$ with $N_j = M_j D_j P = D_j P$ for $j = 1, 2$.
Thus $[M_1, M_2, M_3] = [N_1, N_2, N_3]$ implies

\[ [I_r, I_r, M_3] = [D_1 P, D_2 P, N_3] = [D_1, D_2, N_3 P^T] = [I_r, I_r, N_3 P^T D_1 D_2], \]

which shows $M_3 = N_3 P^T D_1 D_2$. Setting $D_3 = (D_1 D_2)^{-1}$, we find
$N_3 = M_3 D_3 P$.

□
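
The conclusion of Corollary 2 describes exactly the ambiguity that can never be
removed: rescaling columns by diagonal matrices with $D_1 D_2 D_3 = I_r$ and applying
a common column permutation leaves the triple product unchanged. The sketch below
checks only this easy direction on one example; it does not test uniqueness. The
helper names and all numerical values are hypothetical, and rational arithmetic is
an assumption made for exactness.

```python
from fractions import Fraction


def triple_product(M1, M2, M3):
    """Return T with T[j][k][l] = sum_i M1[j][i] * M2[k][i] * M3[l][i]."""
    r = len(M1[0])
    return [[[sum(M1[j][i] * M2[k][i] * M3[l][i] for i in range(r))
              for l in range(len(M3))]
             for k in range(len(M2))]
            for j in range(len(M1))]


def scale_and_permute(M, d, perm):
    """Return M D P: column j of the result is d[perm[j]] times column perm[j] of M."""
    return [[d[perm[j]] * row[perm[j]] for j in range(len(perm))] for row in M]


# Illustrative triple of type (3; 0, 0, 1): M1, M2 have full column rank
# and every pair of columns of M3 is independent (K-rank 2).
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
M1, M2 = identity, identity
M3 = [[1, 1, 1], [1, 2, 3]]

# Diagonal entries with d1 * d2 * d3 = 1 in every position.
d1 = [Fraction(2), Fraction(3), Fraction(5)]
d2 = [Fraction(1, 2), Fraction(7), Fraction(1)]
d3 = [1 / (a * b) for a, b in zip(d1, d2)]
perm = [2, 0, 1]  # the same column permutation is applied to all three factors

N1 = scale_and_permute(M1, d1, perm)
N2 = scale_and_permute(M2, d2, perm)
N3 = scale_and_permute(M3, d3, perm)

same_tensor = triple_product(N1, N2, N3) == triple_product(M1, M2, M3)
```

Using the same `perm` for all three factors matters: permuting the columns of only
one factor would pair up different rank-1 summands and in general change the tensor.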

We now use the lemma to give a new proof of Kruskal's Theorem in its full
generality. Note that the condition on the $a_i$ stated in the following theorem is
equivalent to Kruskal's condition in [10] that
$(r - a_1) + (r - a_2) + (r - a_3) \ge 2r + 2$.

Theorem 3 (Kruskal, [10]). Suppose $(M_1, M_2, M_3)$ is of type $(r; a_1, a_2, a_3)$
with $a_1 + a_2 + a_3 \le r - 2$; $N_1, N_2, N_3$ are matrices with $r$ columns, and
$[M_1, M_2, M_3] = [N_1, N_2, N_3]$. Then there exists some permutation matrix $P$ and
invertible diagonal matrices $D_i$ with $D_1 D_2 D_3 = I_r$ such that
$N_i = M_i D_i P$.

Proof. We need only consider $a_1 + a_2 + a_3 = r - 2$. We proceed by induction on
$r$, with the case $r = 2$ (and 3) already established by Corollary 2. We may also
assume $a_1 \le a_2 \le a_3$. We may furthermore restrict to $a_2 \ge 1$, since the
case $a_1 = a_2 = 0$ is established by Corollary 2.

We first claim that it will be enough to show that, for some $1 \le i \le 3$, there
is some set of indices $J \subseteq [r]$, $1 \le |J| \le r - a_i - 2$, and a
permutation $\sigma \in \mathfrak{S}_r$ such that

\[ \langle \{\mathbf{m}^i_j\}_{j \in J} \rangle = \langle \{\mathbf{n}^i_{\sigma(j)}\}_{j \in J} \rangle. \tag{1} \]

To see this, if there is such a set $J$, assume for convenience $i = 1$ (the cases
$i = 2, 3$ are similar), and the columns of $M_1, N_1$ have been reordered so that
$\sigma = \mathrm{id}$ and $J = [s]$. Let $\Pi$ be a matrix with nullspace the span
described in equation (1). Then

\[ [\Pi M_1, M_2, M_3] = \Pi *_1 [M_1, M_2, M_3] = \Pi *_1 [N_1, N_2, N_3] = [\Pi N_1, N_2, N_3]. \]

But since the first $s$ columns of $\Pi M_1$ and $\Pi N_1$ are zero, these triple
products can be expressed as triple products of matrices with only $r - s$ columns.
That is, using the symbol '~' to denote deletion of the first $s$ columns,

\[ [\Pi \widetilde{M}_1, \widetilde{M}_2, \widetilde{M}_3] = [\Pi \widetilde{N}_1, \widetilde{N}_2, \widetilde{N}_3]. \]

For $i = 2, 3$, since $M_i$ has K-rank $\ge r - a_i$, the matrix $\widetilde{M}_i$
has K-rank $\ge \min(r - a_i, r - s)$. Since the nullspace of $\Pi$ is spanned by the
first $s$ columns of $M_1$, and $M_1$ has K-rank $\ge r - a_1$, one sees that
$\Pi \widetilde{M}_1$ has K-rank $\ge r - s - a_1$, as follows: For any set of
$r - s - a_1$ columns of $\Pi \widetilde{M}_1$, consider the corresponding columns
of $M_1$, together with the first $s$ columns. This set of $r - a_1$ columns of
$M_1$ is therefore independent, so the span of its image under $\Pi$ is of dimension
$r - s - a_1$. This span must then have as a basis the chosen set of $r - s - a_1$
columns of $\Pi \widetilde{M}_1$, which are therefore independent. Thus
$[\Pi \widetilde{M}_1, \widetilde{M}_2, \widetilde{M}_3]$ is of type
$(r - s; a_1, b_2, b_3)$, where $b_i = \max(0, a_i - s)$ for $i = 2, 3$. Note also
that $s \le r - a_1 - 2$ implies $a_1 + b_2 + b_3 \le r - s - 2$.

We may thus apply the inductive hypothesis to
$[\Pi \widetilde{M}_1, \widetilde{M}_2, \widetilde{M}_3] = [\Pi \widetilde{N}_1, \widetilde{N}_2, \widetilde{N}_3]$,
and, after an allowed permutation and scalar multiplication of the columns of the
$N_i$, conclude that $\widetilde{M}_i = \widetilde{N}_i$ for $i = 2, 3$. But this
means we can now take the set $J$ described in equation (1) to be a singleton set
$\{j\}$, with $j > s$, and $i = 2$. Again applying the argument developed thus far
implies that, allowing for a possible permutation and rescaling, all but the $j$th
columns of $M_3$ and $N_3$ are identical. As $\mathbf{m}^3_j = \mathbf{n}^3_j$, this
shows $M_3 = N_3$. Applying this argument yet again, with $i = 3$, and varying
choices of $j$, then shows $M_1 = N_1$ and $M_2 = N_2$, up to the allowed permutation
and rescaling. The claim is thus established.

We next argue that some set of columns of some $M_i$, $N_i$ meets the hypotheses
of the above claim.

Let $\Pi_3$ be any matrix with nullspace
$\langle \{\mathbf{n}^3_i\}_{1 \le i \le a_1 + a_2} \rangle$, spanned by the first
$a_1 + a_2$ columns of $N_3$. Let $Z$ be the set of indices of all zero columns of
$\Pi_3 M_3$. Since every set of $r - a_3 = a_1 + a_2 + 2$ columns of $M_3$ is
independent, $|Z| \le a_1 + a_2$. Note also that at least 2 columns of $\Pi_3 M_3$
are independent, since the span of any $a_1 + a_2 + 2$ columns of $\Pi_3 M_3$ is at
least 2-dimensional.

Let $\mathcal{S}_1, \mathcal{S}_2$ be any disjoint subsets of $[r]$ such that
$|\mathcal{S}_1| = a_2$, $|\mathcal{S}_2| = a_1$,
$Z \subseteq \mathcal{S}_1 \cup \mathcal{S}_2 = \mathcal{S}$, and $\mathcal{S}$
excludes at least two indices of independent columns of $\Pi_3 M_3$. Let
$\Pi_1 = \Pi_1(\mathcal{S}_1)$ be any matrix with nullspace
$\langle \{\mathbf{m}^1_i\}_{i \in \mathcal{S}_1} \rangle$, and let
$\Pi_2 = \Pi_2(\mathcal{S}_2)$ be any matrix with nullspace
$\langle \{\mathbf{m}^2_i\}_{i \in \mathcal{S}_2} \rangle$.

Now consider

\begin{align*}
[\Pi_1 M_1, \Pi_2 M_2, \Pi_3 M_3] &= \Pi_3 *_3 (\Pi_2 *_2 (\Pi_1 *_1 [M_1, M_2, M_3])) \\
&= \Pi_3 *_3 (\Pi_2 *_2 (\Pi_1 *_1 [N_1, N_2, N_3])) = [\Pi_1 N_1, \Pi_2 N_2, \Pi_3 N_3].
\end{align*}

By the specification of the nullspace of $\Pi_3$, the columns of all $N_i$ with
indices in $[a_1 + a_2]$ can be deleted in this last product. In the first product,
one can similarly delete the columns of the $M_i$ with indices in $\mathcal{S}$, due
to the specifications of the nullspaces of $\Pi_1$ and $\Pi_2$. Using '~' to denote
the deletion of these columns, we have

\[ [\Pi_1 \widetilde{M}_1, \Pi_2 \widetilde{M}_2, \Pi_3 \widetilde{M}_3] = [\Pi_1 \widetilde{N}_1, \Pi_2 \widetilde{N}_2, \Pi_3 \widetilde{N}_3], \tag{2} \]

where these products involve matrix factors with $r - a_1 - a_2 = a_3 + 2$ columns.

The matrix $\Pi_1 \widetilde{M}_1$ in fact has full column rank. To see this, note
that it can also be obtained from $M_1$ by (a) first deleting columns with indices
in $\mathcal{S}_2$, then (b) multiplying on the left by $\Pi_1$, and finally (c)
deleting the columns arising from those in $M_1$ with indices in $\mathcal{S}_1$.
Since $M_1$ has K-rank at least $r - a_1$, step (a) produces a matrix with
$r - a_1$ columns, and full column rank. Since the nullspace of $\Pi_1$ is spanned
by certain of the columns of this matrix, step (b) produces a matrix whose non-zero
columns are independent. Step (c) then deletes all zero columns to give a matrix of
full column rank. Similarly, the matrix $\Pi_2 \widetilde{M}_2$ has full column rank.

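Noting that $\Pi_3 \widetilde{M}_3$ has no zero columns since
$Z \subseteq \mathcal{S}$, we may thus apply Lemma 1 to the products of equation (2).
In particular, we find that there is some $\sigma \in \mathfrak{S}_r$ with
$\sigma([r] \setminus \mathcal{S}) = [r] \setminus [a_1 + a_2]$ such that if
$\mathcal{I}$ is a maximal subset of $[r] \setminus \mathcal{S}$ with respect to the
property that $\langle \{\Pi_3 \mathbf{m}^3_i\}_{i \in \mathcal{I}} \rangle$ is
1-dimensional, then

\[ \langle \{\Pi_j \mathbf{m}^j_i\}_{i \in \mathcal{I}} \rangle = \langle \{\Pi_j \mathbf{n}^j_{\sigma(i)}\}_{i \in \mathcal{I}} \rangle \tag{3} \]

for $j = 1,2,3$.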

Since we chose $\mathcal{S}$ to exclude indices of two independent columns of
$\Pi_3 M_3$, there will be such a maximal subset $\mathcal{I}$ of
$[r] \setminus \mathcal{S}$ that contains at most half the indices. We thus pick
such an $\mathcal{I}$ with
$|\mathcal{I}| \le \lfloor (r - a_1 - a_2)/2 \rfloor = \lfloor a_3/2 \rfloor + 1$,
and consider two cases:

Case $a_1 = 0$: Then $\mathcal{S}_2 = \emptyset$, and $\Pi_2$ has trivial nullspace
and thus may be taken to be the identity. Since $a_3 \ge a_2 \ge 1$, this implies
$|\mathcal{I}| \le a_3 = r - a_2 - 2$. The sets
$\{\mathbf{m}^2_i\}_{i \in \mathcal{I}}$ and
$\{\mathbf{n}^2_{\sigma(i)}\}_{i \in \mathcal{I}}$ therefore satisfy the hypotheses
of the claim.

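Case $a_1 \ge 1$: Note that
$|\mathcal{I}| + a_2 + 1 \le \lfloor a_3/2 \rfloor + a_2 + 2 \le a_2 + a_3 + 2 = r - a_1$,
so for any index $k$, the columns of $M_1$ indexed by
$\mathcal{I} \cup \mathcal{S}_1 \cup \{k\}$ are independent. This then implies that
for $j = 1$ the spanning set on the left of equation (3) is independent, so the
spanning set on the right is as well. Thus the set
$\{\mathbf{n}^1_{\sigma(i)}\}_{i \in \mathcal{I}}$ is also independent. Note next
that equation (3) implies that, for $i \in \mathcal{I}$, there are scalars
$b^i_j, c^i_k$ such that

\[ \mathbf{n}^1_{\sigma(i)} = \sum_{j \in \mathcal{I}} b^i_j \mathbf{m}^1_j + \sum_{k \in \mathcal{S}_1} c^i_k \mathbf{m}^1_k. \tag{4} \]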

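Now for any $p \in \mathcal{S}_1$, $q \in \mathcal{S}_2$, let

\[ \mathcal{S}_1' = (\mathcal{S}_1 \setminus \{p\}) \cup \{q\}, \qquad \mathcal{S}_2' = (\mathcal{S}_2 \setminus \{q\}) \cup \{p\}. \]

Choosing $\Pi_1'$ and $\Pi_2'$ to have nullspaces determined as above by the index
sets $\mathcal{S}_1'$ and $\mathcal{S}_2'$, and applying Lemma 1 to
$[\Pi_1' \widetilde{M}_1, \Pi_2' \widetilde{M}_2, \Pi_3 \widetilde{M}_3] = [\Pi_1' \widetilde{N}_1, \Pi_2' \widetilde{N}_2, \Pi_3 \widetilde{N}_3]$,
similarly shows that for some permutation $\sigma'$ and any $i' \in \mathcal{I}$
there are scalars $d^{i'}_j, f^{i'}_k$ such that

\[ \mathbf{n}^1_{\sigma'(i')} = \sum_{j \in \mathcal{I}} d^{i'}_j \mathbf{m}^1_j + \sum_{k \in \mathcal{S}_1'} f^{i'}_k \mathbf{m}^1_k. \tag{5} \]

Note that since the same $\Pi_3$ was used, the set $\mathcal{I}$ is unchanged here,
and $\sigma$ and $\sigma'$ must have the same image on $\mathcal{I}$. Picking
$i' \in \mathcal{I}$ so that $\sigma'(i') = \sigma(i)$, and subtracting equation (4)
from (5) shows

\[ \sum_{j \in \mathcal{I}} (b^i_j - d^{i'}_j)\, \mathbf{m}^1_j = \sum_{k \in \mathcal{S}_1 \setminus \{p\}} (f^{i'}_k - c^i_k)\, \mathbf{m}^1_k + f^{i'}_q \mathbf{m}^1_q - c^i_p \mathbf{m}^1_p. \]

But since the columns of $M_1$ appearing in this equation are independent, we see
that $f^{i'}_q = c^i_p = 0$. By varying $p$, we conclude that
$\mathbf{n}^1_{\sigma(i)} \in \langle \{\mathbf{m}^1_j\}_{j \in \mathcal{I}} \rangle$.
Thus
$\langle \{\mathbf{n}^1_{\sigma(i)}\}_{i \in \mathcal{I}} \rangle \subseteq \langle \{\mathbf{m}^1_i\}_{i \in \mathcal{I}} \rangle$.
Since both of these spanning sets are independent, and of the same cardinality,
their spans must be equal. Since $|\mathcal{I}| \le r - a_1 - 2$, the set
$\mathcal{I}$ satisfies the hypotheses of the claim.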

□ 